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0.0  Introduction  to  Final  Report 

The  following  is  the  final  report  on  "New  Directions  in  the  Digital 
Signal  Processing  of  Image  Data  (Image  Modeling  for  Exploitation,  Project 
No.  4594)"  Contract  Number  F30602-82-K-0151.  The  report  consists  of  two 
volumes.  The  first  presents  the  work  accomplished  in  descriptive,  but 
summary  form  and  lists  the  publications  and  presentations  supported  by  the 
contract.  This  first  volume  fulfills  the  requirement  of  the  final  report  on 
the  contract.  The  second  volume  is  optional  and  is  a  compilation  of  the 
published  articles,  theses  or  dissertations,  and  project  reports  produced 
from  this  contract.  The  first  volume  is  divided  into  six  major  sections 
corresponding  to  the  six  major  tasks  comprising  this  report.  Within  each 
section  are  described  the  results  of  the  relevant  task  along  with  its  cited 
references  and  its  publications. 

The  numbering  of  the  compiled  publications  in  the  second  volume  is  the 
same  as  that  in  the  first  volume  for  ease  of  reference.  The  accomplishments 
of  the  contract  were  voluminous  and  truly  noteworthy,  as  evidenced  by  sheer 
magnitude  of  the  entire  report  and  the  number  of  published  articles  and 
theses  emanating  from  it. 
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1.0  TARGET  DETECTION  AND  ROTATION-INVARIANT  PATTERN  RECOGNITION 

1.1  TARGET  DETECTION  BY  CORRELATION 

In  this  task,  we  investigated  the  problem  of  scene  matching  with  a 
reduced  resolution  reference  scene.  The  completed  work  appears  in  the 
Master's  degree  project  (thesis)  report  of  H.  Kidorf  [1.3-1]. 

In  this  thesis  we  investigate  the  problem  of  scene  matching  by  matched 
filtering  when  the  target  scene  and  the  reference  scene  are  of  different 
resolutions.  As  is  well  known,  the  solution  to  the  matched  filter  problem 
under  additive  white  noise  conditions  is  the  correlation  receiver.  The 
effect  on  the  performance  of  this  target  detector  is  desired  under  various 
imaging  conditions:  radial  blurring,  azimuthal  blurring  and  with  additive 
white  noise.  Additionally,  we  examine  system  performance  when  the  reference 
scene  resolution  is  reduced  to  that  equivalent  of  the  target. 

This  investigation  is  composed  of  the  following  steps: 

1.  Generation  of  a  set  of  blurred  test  images 

2.  Correlation  with  a  reference  image  (either  high  resolution  or 
blurred) 

3.  Examination  of  peak  correlation  as  a  function  of  the  degree  of  blur 

A.  Generation  of  noisy  images  and  analysis  as  in  Steps  2  and  3. 

Blurred  images  are  first  produced  in  polar  coordinates  using  a 

combination  of  the  Circular  Harmonic  Transform  (CHT)  to  implement  azimuthal 
blurring,  and  convolution  with  a  gaussian  filter  in  implement  radial 
blurring.  The  resultant  image  is  then  interpolated  to  values  that  comprise  a 


cartesian  grid. 


After  acquiring  a  net  of  images  with  varying  iadial  and  azimuthal  blur, 
noise  is  added  in  various  amounts.  Analysis  is  then  carried  out  to 
determine  the  effects  of  the  varied  parameters  on  the  performance  of  image 
c  ross-correlat ion. 

In  this  thesis  it  was  found  that  compared  to  the  effects  of  azimuthal 
blur,  radial  blur  has  s  lesser  effect  on  the  performance  of  scene  matching 
by  correlation.  A  comparison  of  the  rates  at  which  radial  blur  and 
azimuthal  blur  lessen  the  peak  correlation  of  these  blurred  target  images 
with  the  unbiurred  reference  image  shows  that  radial  blur  degrades  this 
performance  measure  58%  slower  than  azimuthal  blurring  does.  Additionally, 
the  reduction  of  peak  correlation  as  a  function  of  radial  blur  was  found  to 
display  a  threshold-1  ike  behavior,  i . e .  correlation  of  radially  blurred 
images  with  greater  than  65%  of  their  high  frequency  content  removed  results 
in  the  peak  correlation  being  reduced  by  the  effects  of  blurring  at  a  rate 
more  than  four  times  faster  than  with  those  images  with  blur  levels  below 
this  threshold. 

Investigation  into  the  effect  on  correlation  performance  of  adding 
noise  to  the  images  was  quantified  and  data  presented  so  that  trade-offs  can 
be  considered  between  the  addition  of  noise  and  the  acceptance  of  increased 
blur.  With  use  of  this  data  each  of  the  three  parameters  arc 
interchangeable  (though  not  equivalent)  in  considering  their  effect  on 
cross-correlation  so,  for  instance,  the  reduction  'in  correlation  performance 
due  to  radial  blur  can  alternatively  be  thought  of  in  terms  of  the  addition 
noise  or  in  terms  of  equivalent  azimuthal  blur. 

An  additional  area  of  investigation  was  the  investigation  of  the  effect 
of  performing  scene  matching  by  correlation  using  a  reduced  rest-]  ut  ion 
reference  image  rather  than  matching  the  blurred  test  image  with  a  high 
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resol  ut  i  on  one.  It  was  discovered  that  when  matching  a  slightly  blurred 
image  that  the  degradation  suffered  was  only  slight.  But,  as  the  degree  of 
blur  of  the  test  image  increased,  the  performance  of  correlation  decreased 
signif icantly . 

The  work  in  this  thesis  also  demonstrated  the  ability  to  simulate  the 
real  world  images  obtained  with  limited  resolution  sensors  and  sampled  at 
less  than  optimal  sampling  rates.  This  tool  proved  useful  in  the 
investigation  into  the  effects  of  lessened  resolution  on  image  correl:.-  ;  ion. 

1.2  ROTATION-INVARIANT  PATTERN  RECOGNITION 


The  problem  of  recognizing  a  target  pattern  regardless  of  its 
orientation  is  the  subject  of  this  task. 

1 ,  2 

In  two  recent  interesting  papers  by  Hsu  and  co-workers,  *  the  method 
of  circular  harmonic  function  (CHF)  expansion  is  suggested  for  achieving 
rotation-invariant  pattern  recognition  by  either  optical  or  digital  means. 
This  promising  approach  indeed  does  detect  targets  regardless  of  orientation 
but  seems  to  require  -  at  least  based  on  our  experience  -  high  SNR  and/or 
high  contrast  imagery.  Hsu's  test  statistic,  i.e.,  the  function  of  his 
observation,  is  a  scalar,  a  situation  essentially  forced  on  him  because  of  a 
property  of  his  algorithm. 

In  the  paper  "Rotat i on- inva r  i  ant  Pattern  Recognition  Using  a  Vector 
Reference"  by  R.  Wu  and  1’.  Stark  [1.3-2]  we  present  a  modification  of  the 
CHF  expansion  algorithm  that  enables  us  to  use  vector  statistics.  In  turn, 
using  vector  statistics  we  can  detect  and  locate  targets  in  poor  SNRs  and 
low-contrast  imagery  without  sacrificing  rotation  invariance.  Instead  of 
using  correlation  with  a  single  harmonic  and  establishing  an  optimum 
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expansion  center,  we  use  multiple  harmonics  combined  with  a  less-than- 
optimum  expansion  center. 

In  another  paper  "Rotation-Invariant  Pattern  Recognition  Using  Optimum 
Feature  Extraction"  by  R.  Wu  and  H.  Stark  [1.3-3,  1.3-4]  we  consider  the 
problem  of  detecting  a  target  regardless  of  its  orientation  when  it  is  known 
that  the  target  must  be  from  one  of  two  classes.  We  assume  significant 
random  intraclass  variability,  a  complication  which  requires  techniques  from 
statistical  pattern  recognition  for  amelioration.  The  Foley-Sanur.on 
transformation  for  selecting  optimum  features  from  random  training  samples 
is  used  to  solve  the  problem. 

By  combining  vector  rotation-invariant  signatures  with  sequential 
projection  onto  optimum  subspaces  for  enhancing  class  separability  we  have 
been  able  to  achieve  rotation-invariant  recognition  with  a  high  degree  of 
accuracy.  The  use  of  vector  signatures  enables  correct  recognition  in 
relatively  poor  signal -  to- noise  environments,  where  other  nethods  might 
.3 

fail.  Optimum  feature  extraction  enables  object-class  recognition  when 
strong  statistical  variations  exist  within  the  object  classes.  We  have 
demonstrated  this  property  even  when  one  object  is  embedded  in  another  (as  P 
is  embedded  in  B). 

Some  of  these  gains  are  brought  about  at  the  expense  of  greater 
computational  complexity.  The  number  of  harmonics  N  needed  to  recognize  an 
object  depends  on  the  complexity  of  the  object  and  is  determined  from  tests. 
Thus,  to  a  first  approximation,  our  algorithm  requires  !i  times  as  many 
computations  as  Hsu's.  On  the  other  hand,  wc  do  not  need  to  search  for  a 
proper  expansion  center.  The  derivation  of  the  opt  in- urn  subspaces  for 
feature  selection  is  computationally  intensive  but  is  done  only  once,  i.e., 
before  the  actual  recognition  problem  begins.  The  project  ion  of  the 
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s  igna.  euros 


onto  the  optimum  subspaces  requires  only  a  matrix 


multiplication.  In  the  worst  case,  only  three  sequential  tests  need  to  be 
done  for  separating  two  classes.  In  situations  where  processing  time  must 
be  kept  low,  a  high-speed  array  processor  of  a  microprocessor  dedicated  to 
the  recognition  problen s  can  be  used. 

In  the  two  earlier  papers,  [1.3-2,  1.3-3]  we  considered  the  problem  of 
how  a  rrachine-v  i  s  •' on  system  might  recognize  an  object  in  a  two-dimensional 
(2-0)  scene  (t.g. ,  an  inage)  regardless  of  the  orientation  of  the  object. 
We  called  this  problem  Z-Z  rotation-invariant  pattern  recognition.  There 

A 

have  heel'.  Mirierov.il  approaches  to  this  problem  including  those  of  Hu,  Smith 

,  ..  .  ,  5  ,  ..  .  6  ,  ,  .  .  „ 

and  ..right,  it  xorg  arm  •vu  ,,  wno  usee  moment  invariants.  Fourier 


.p  t. 


re  use 


b  v 
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and  o  t  h  e  ; 


8, <5 


Casasent  and 


Psal  t achieved  perit  ion-,  scale-,  and  rot  at  ion- invariant  2-D  recognition 
by  <  rd.  >r  !tig  p.i  1  a  r  F«m:  tier  •  r  d  Kel  1  !  n  1 1  an  s  forms  .  Arsenault  and  co- 

weikers1’'  have  a  p  p  t  <  a  c  1  e their  problem  in  a  novel  way,  using  the 
ceef  f  ir  ion  r of  a  c  '  rn:  1  a  r-he  'non  ic  function  (CHF)  expansion  for  rotation- 
ir.Vc!  t  iant  tecogr.it  ion.  Tndeed  it  is  their  approach  that  we  have  extended  in 
(or  w«.rk .  Cue  of  t'se  purposes  of  t!  is  paper  is  to  show  that  scale-invariant 
tprogrit  i t  p  car  bt  easily  and  efficiently  incorporated  in  the  CHF  algorithm 
to  yield  robust  rot  at  ion/  seal  e-  Invar  i  atit  recognition. 

A  r.<  con;  1  ex  problem  is  that  of  having  a  machine  recognize  a  three- 
dimensional  (3-D)  object.  Iri  this  ana  most  of  the  proposed  techniques  seem 
t  t  he  ox  tor.  s  >'  one  of  results  of  2-D  investigations.  For  example,  3-D 
recognition  nr'ng  Fourier  descriptor  of  the  boundary  curve  was  used  by 


pi  rha  rrl 
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3-D  moments  that  are 
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invariant  under  position,  size,  and  orientation  changes.  3-D  aircraft 

13 

identification  has  been  investigated  by  using  Fourier  descriptors  and 
14 

moment  invariants. 

Numerous  investigations  in  both  2-D  and  3-D  machine  recognition  are 
cited  in  the  above  references.  A  run.be r  of  papers  dealing  with  2-D  and  3-D 
recognition  were  read  at  a  recent  meeting  on  machine  vision;  brief  abstracts 

of  this  meeting  are  available. ^  In  general,  however,  research  in  3-D 
recognition  illuminates  two  major  problems:  the  storage  of  vast  amounts  of 
reference  data  and  the  computational  complexity  involved  in  establishing  a 
match  between  the  test  object  arid  one  of  the  references.  Therefore 
extracting  and  matching  features  of  images  rather  than  matching  the  images 
themselves  is  the  trend  in  3-D  recognition  research. 

Our  major  objective  of  this  paper  is  to  describe  a  3-D  machine- 
recognition  procedure  that  uses  the  robust  rotation-  and  scale-invariant 
recognition  ability  of  the  C'llF/Kel  1  i  n-t  rans  f  orm  approach,  which  will  be 
described  in  the  next  section.  Of  course,  to  understand  the  3-D  problem  we 
must  necessarily  be  more  restrictive  in  our  assumptions  regarding  the 
degrees  of  freedom  in  the  positions  of  the  test  object.  In  this  regard,  the 
most  critical  assumption  is  that  there  is  only  one  degree  of  rotational 
uncertainty  in  the  object's  position.  If  we  think  of  the  object  as  being  at 
the  origin  of  s  3-D  Cartesian  coordinate  system,  then  we  allow  a  random 
angular  displacement  about  the  y  axis  but  not  about  the  x  and  z  axes. 
Despite  this  restrictive  assumption,  we  believe  that  our  results  can  be 
useful  in  a  number  of  robot-vision  situations. 

The  central  result  of  this  research  is  a  new  data-mapping  procedure 
from  3-D  observation  space  to  2-D  feature  space  based  on  pseudotomography. 
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The  2-D  feature  data  are  organized  into  a  signature  image  from  wl  ich  scale- 
and  rotation-invariant  features  are  extracted.  Scale  invariance  is  obtained 
by  using  the  Mellin  transform;  rotation  invariance  is  obtained  by  using  CHF 
coefficients.  The  particular  signature  image  that  we  chose  has  the 
following  properties:  (1)  it  incorporates  in  one  2-D  image  information 

about  all  the  views  of  the  object,  (2)  it  is  easily  processed  to  obtain 
invariance  with  respect  to  scale,  (3)  it  is  invariant  with  respect  to 
brightness  changes,  (A)  it  is  easily  processed  to  obtain  invariance  with 
respect  to  rotation,  (5)  it  is  invariant  with,  respect  to  lateral  shifts  of 
the  rotation  axis  about  which  the  3-D  object  is  viewed,  (6)  it  is  invariant 
with  respect  to  vertical  translation  of  the  object,  and  (7)  it  gave  robust 
performance.  The  work  appears  in  the  paper,  "Three-Dimensional  Object 
Recognition  from  Multiple  Views"  by  F.  Vfu  and  H.  Stark  [1.3-5], 
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2.0  IMAGE  RECOVERY  FROM  INCOMPLETE  INFORMATION 

2.1  DESCRIPTION  OF  COMPLETE  WORK 

Two  major  journal  articles  [2.2-1,  2.2-2]  and  the  Ph.D.  dissertation  of 
A.  Levi  [2.2-3]  describe  the  results  of  the  investigation  of  image  recovery 
from  incomplete  information.  We  shall  present  the  abstracts  and  conclusions 
of  these  works.  Further  details,  derivations  and  results  are  found  in  the 
body  of  these  works. 

The  first  paper  is  "Signal  Restoration  from  Phase  by  Projections  onto 
Convex  Sets"  (POCS)  by  A.  Levi  and  h.  Stark  [2.2-1].  In  this  paper  we  have 
discussed  the  RFP  (restoration  from  phase)  problem  using  a  new  iterative 
algorithm  known  as  POCS  (projections  onto  convex  sets).  The  method  allows 
from  any  number  of  a  prior  known  image  constraints  to  be  incorporated  in  the 
algorithm  provided  that  these  can  be  associated  with  convex  sets.  We  have 
discussed  methods  of  approximately  optimizing  the  relaxation  parameters  and 
shown  thereby  that  a  significant  improvement  in  performance  can  be  obtained. 

We  have  shown  experimentally  that  the  method  of  POCS  failed  to  provide 
a  unique  restoration  for  signals  that  violated  the  uniqueness  requirements 

discussed  by  Hayes^  and  herein.  Finally,  we  have  compared  POCS  with  the 

2 

well-known  HLO  RFP  algorithm  and  shown  that  the  POCS  algorithm  performs 
essentially  as  well  as  the  other  while  ensuring  strong  convergence  in  the 
finite-dimensional  case.  A  demonstration  in  which  POCS  yielded  convergence 
while  the  HLO  method  did  not  was  furnished. 

The  second  paper  is  "Image  Restoration  by  the  Method  of  Generalized 
Projections  with  Application  to  Restoration  from  Magnitude"  by  A.  Levi  and 
H.  Stark  [2.2-2].  The  method  of  projections  onto  convex  sets  can  be  used  to 
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solve  many  problems  in  image  restoration,  e.g.,  restoration  fiou  phase, 
spectral  extrapolation,  and  signal  recovery  in  computer-aided  tomography. 
However,  image-restoration  problems  involving  nonconvex  constraints  cannot 
be  handled  by  the  method  of  projection  onto  convex  sets  in  a  fashion  that 
ensures  convergence.  The  restot at  ion -f tom-magnitude  (RFM)  problem  is  such  a 
case.  To  handle  the  RFM  as  well  as  other  nonconvex  const raints,  we  describe 
an  algorithm  known  as  generalized  projections  and  discuss  its  properties. 
When  sets  are  nonconvex,  it  is  possible  for  the  algorithm  to  exhibit 
pathological  behavior  that  is  never  manifest  in  convex  projections.  We 
introduce  an  error  criterion  called  the  summed-distance  error  (SDE)  and  show 
under  what  circumstances  the  SDE  is  a  monotonically  decreasing  function  of 
the  number  of  iterations.  Near-optimum  performance  of  the  algorithm  is 
achieved  by  relaxation  parameters.  Comparisons  with  other  RFM  methods  are 
furnished  for  synthetic  imagery. 

o  We  have  found  that  the  number  of  iterations  required  to  restore  an 
image  may  generally  be  reduced  by  30%  or  more  by  using  relaxation 
parameters  compared  with  the  method  of  pure  projections, 
o  The  variation  of  the  SDE  with  relaxation  parameter  X^  shows  the 

following  behavior.  For  a  small  number  of  iterations,  the  minimum 
is  relatively  sharp,  and  relatively  small  deviations  from  the 
optimum  value  of  X^  degrade  the  performance  as  measured  by  the  SDE. 

As  the  number  of  iterations  goes  up,  however,  the  minimum  becomes 
shallower,  and  small  deviations  from  the  optimum  value  have 
relatively  little  effect.  In  nearly  all  cases,  the  range  of  the 
optimum  value  of  X^  was  in  the  range  1.5-3. 0. 
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o  Unlike  with  the  extrapolation  problem,  the  use  of  the  two-level 
constraints  in  the  RFM  problem  gives  much  better  results  than  those 
obtained  without  it. 

o  An  important  observation  is  that  the  SDE  J(f  )  cannot  be  used  either 

n 

as  a  measure  for  the  true  error  or  to  compare  different  algorithms. 

In  fact,  it  can  be  shown  that,  for  the  algorithm,  J  (f  )  is  always 

n 

smaller  than  the  error  II  e  |j  =  ||f  -  f  |{. 

n  "  "  n  " 

o  In  many  cases  we  observed  tunnel  behavior  for  large  interation 
number  n  [i.e.,  the  change  in  f  from  iteration  to  iteration  and  the 

corresponding  change  in  J(f^)  becomes  negligible].  The  poor 

restorations  with  relatively  low  J(f  )  after  1000  iterations 

n 

exhibited  in  some  of  the  examples  support  the  assumption  that  traps 
exist.  These  examples  also  demonstrate  the  importance  of  detecting 
traps  and  tunnels  and  of  finding  ways  of  getting  out  of  them  by 
changing  the  algorithm. 

1.  M.  H.  Hayes,  "The  reconstruction  of  a  multidimensional  sequence  from 
the  phase  or  magnitude  of  its  Fourier  transform,"  IEEE  Trans.  Acoust. 
Speech  Signal  Process.  ASSP-30,  140-154  (1982). 

2.  M.  H.  Hayes,  J.  S.  Lim,  and  A.  V.  Oppenheim,  "Signal  reconstruction 
from  phase  or  magnitude,"  IEEE  Trans.  Acoust.  Speech  Signal  Process. 
ASSP-28,  672-680  (1980). 

The  abstract  of  A.  Levi's  Ph.D.  dissertation  [2.2-3]  follows. 

This  dissertation  deals  with  the  problem  of  restoring  images  from 
incomplete  information  by  the  method  of  alternating  projection  onto  convex 
sets  (POCS)  and  its  extension:  the  method  of  generalized  projections  (MGP). 
the  investigation  discussed  herein  are  both  theoretical  and  experimental. 


< 
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We  demonstrate  the  weak  convergence  property  of  the  POCS  algorithm  when  the 
relaxation  parameters  (RP)  are  varied  from  iteration  to  iteration.  We  then 
demonstrate  how  the  RP's  can  be  optimized  to  accelerate  convergence.  We 
illustrate  the  utility  of  these  theoretical  results  by  restoring  a  finite- 
size  (space-limited)  image  from  the  phase  of  its  Fourier  transform.  This 
problem  is  the  well-known  restoration  from  phase  problem.  We  compare  our 
result  with  the  standard  solution  (the  Hayes-Lim-Oppenheim,  or  HLC1,  method). 
The  superior  convergence  properties  of  the  POCS  method  over  the  HLO  method 
is  demonstrated  by  several  examples  in  which  the  HLO  method  diverges  while 
the  POCS  method  does  not. 

The  MGP  and  its  application  to  problems  which  involve  non-convex  type 
constraints,  is  derived.  The  MGP  is  a  two-step  algorithm  which  possesses 
the  property  of  set-distance-reduction  for  certain  values  of  the  RP's.  We 
show  that  this  property  does  not  extend  to  algorithms  with  more  than  two 
steps  (or  two  operators)  in  one  iteration,  but  the  restriction  to  two  steps 
is  not  very  restrictive  in  practice  because  several  image  constraints  can  be 
combined  into  one  set,  A  performance  measure  called  summed  distance  error 
(SDE)  is  defined  and  is  shown  to  be  useful  for  monitoring  and  controlling 
the  algorithm.  Pathological  phenomena  called  traps  and  tunnels  are  defined 
and  shown  to  occur  in  the  MGP  when  non-convex  sets  are  involved.  They  are 
responsible  for  the  observed  convergence  to  non-valid  solutions  or  very  slow 
convergence  to  valid  ones. 

The  MGP  algorithm  is  applied  to  the  problem  of  restoring  a  signal  when 
we  are  given  only  the  magnitude  of  its  Fourier  transform  (the  restoration 
from  magnitude  (RFM)  problem).  We  show  that  the  MGP  method  incorporates 
other  algorithms  for  the  RFM  problem  such  as  the  Gerchberg-Saxton  (GS) 
method  and  Fienup's  output-output  algorithm.  Methods  of  optimizing  the  RP's 
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on  a  per-step  and  per-cycle  basic  are  described  and  the  per-cycle  method  is 
shown  to  significantly  improve  the  convergence  rate.  The  method  of  MGP  is 
shown  to  be  useful  for  detecting  traps  and  tunnels  and  for  possibly  getting 
out  of  them.  Experimental  demonstrations  showing  the  effects  of  noise  on 
the  restoration  are  also  furnished. 

We  close  the  dissertation  by  proposing  several  unresolved  research 
problems  that  warrant  further  investigations. 
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POST  DETECTION  PROCESSING  OF  PHOTON-LIMITED,  BLURRED  IMAGES 


3.0  INTRODUCTION 

The  focus  of  the  investigation  of  means  to  restore  photon-limited  and 
blurred  images  was  to  develop  new  techniques  where  the  image  statistics  were 
unknown  and  the  processing  could  be  realized  in  real-time  with  dedicated 
hardware.  We  describe  here  three  new  techniques  which  gave  very  good 
results  with  fast  processing. 

3.1  TWO  STEPS  ADAPTIVE,  ROBUST  ESTIMATOR 

The  first  is  a  two-step  procedure  which  restores  low  light-level  images 
regarded  by  a  linear  space-invariant  blur.  The  presentation  of  the  results 
of  the  research  occurred  at  the  SPIE  Annual  International  Symposium  on 
Optics  and  Electro-Optics  in  San  Diego,  California  in  August  1984  with 
publication  of  a  full  article  in  the  symposium's  Proceedings  [1],  As  the 
blurred  image  is  incident  on  the  photo-detector  array,  the  first  step 
attempts  to  estimate  the  blurred  image  from  the  collection  of  photocounts 
from  the  array.  The  optimum  linear  minimum  mean  squared  estimator  (LMMSE)  is 
obtained  by  solving  an  equation  by  Grandell  [2]  for  estimating  the  intensity 
in  a  doubly  stochastic  Poisson  field.  The  error  of  the  optimum  LMMSE 
estimator  is  orthogonal  to  the  photocount  process.  Hence  this  error  may  be 
treated  as  additive  and  uncorrelated  noise  in  the  next  step,  which  forms  an 
estimate  of  the  true  object  from  the  first  steps  estimate  of  the  incident 
(blurred  image)  intensity.  It  is  accomplished  through  an  application  of  thv 
constrained  LMMSE  (deconvolution)  method  set  forth  by  Hunt  [3].  An  adaptive 
windowing  technique  generates  sample  statistics  for  the  two-step  estimator. 
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Hence  the  method  requires  no  prior  knowledge  of  image  statistics  to 
accomplish  recovery  of  the  object.  The  processing  is  very  fast  because  the 
required  sample  statistics  for  the  estimators  and  the  window  size  logic  are 
calculated  recursively  and  the  estimates  are  linear  ones  of  point 
intensities.  A  full  description  of  the  estimators  and  their  derivations  is 
presented  in  the  aforementioned  paper  [1].  Here  we  shall  just  describe  some 
of  the  simulation  results. 

The  system  for  generating  a  blurred,  photon  limited  image  and  the  two- 
step  restoration  procedure  are  illustrated  in  Figure  1.  The  linear  blur 

1 

matrix  H  acting  upon  the  original  image  f  has  elements  h(i,j)  =  —  , 

J  K, 

i=0, 1,  .  .  .  , J-l  and  j=0, 1,  .  .  .  ,K-1 .  The  Poisson  noise  generator  for  a  given 
parameter  q  produces  the  array  of  photocounts  £  =  g(i,j)  which  are  each 

scaled  by  q  for  display  of  the  phot  on- 1  im  i  t ed ,  blurred  image  g^.  The 

received  or  input  signal-to-noise  ratio  is  defined  to  be 

SNR .  =  <.(f  r<-f>i-_> 

1  «gd-f)  > 


where  <x>  denotes  the  spatial  sample  average  over  the  array  x.  The  restored 
or  output  signal-to-noise  SNR^  is  defined  as  above  with  the  estimate  f 

replacing  g^. 

The  two  test  images  used  in  our  simulations  are  the  face  images  of  a 
man  and  woman.  For  each  of  these  images,  Poisson  noise  degradation  with 
different  parameter  q  and  degrees  of  lineal  motion  blur  were  simulated  and 
then  restored.  Shown  in  Figure  2  is  one  typical  example  of  a  degraded  image 
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with  Poisson  noise  parameter  q=0.125  and  4x4  (J  =  K  =  4)  blur  and  its 
restoration.  With  blur,  the  restored  image  is  not  as  sharp  and  appears 
slightly  more  noisy  with  an  SNR  improvement  of  7.3  dB.  The  improvement  with 
blur  is  higher  than  without  blur,  because  the  blurring  procedure  produces 
further  degradation  from  the  original  image  and  the  estimator  partially 
deblurs  it. 

Comparison  with  other  methods  is  complicated  by  the  use  of  different 
test  images  and  degrees  of  Poisson  noise  and  by  inconsistencies  in 
definitions  of  signal-to-noise  ratios.  Another  factor  is  the  amount  of 
processing  or  degree  of  computational  complexity  involved  in  different 
schemes.  For  example,  Lo  and  Sawchuk  used  noisier  test  images  and  a 
different  definition  of  s i g na 1 - t o-no i s e  ratio  [4],  Kasturi  [5]  compares 
several  kinds  of  estimators  on  less  noisy  images  than  ours  and  his  SNR 
calculations  include  the  image  means.  The  better  estimators  give  good 
perceptual  results,  but  use  prior  statistical  knowledge  and  significant 
processing.  The  estimator  here  does  compare  favorably  with  these  efforts, 
however,  especially  considering  the  robustness  and  adaptivity  of  the  method 
and  the  relatively  small  processing  requirements. 

3.2  MINIMUM  ERROR,  MAXIMUM  CORRELATION  ESTIMATOR 

One  of  the  deficiencies  of  current  image  estimators  is  that  the 
estimation  error  or  noise,  although  uncorrelated  with  the  data  image  has 
positive  correlation  with  the  true  object.  In  principle,  therefore,  the 
estimation  noise  can  be  further  processed  to  extract  more  information  about 
the  object.  To  remove  this  deficiency,  a  new  estimator  is  proposed  which 
attempts  simultaneously  to  maximize  the  correlation  between  the  estimate 
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(or,  equivalently,  minimize  the  correlation  between  the  error  and  the 
object)  and  minimize  the  mean  squared  error.  It  turns  out  that  this  is  an 
optimization  problem  with  competing  objectives.  The  solution  is  a 
compromise  between  the  usual  minimum  mean  squared  error  (Wiener)  estimator 
and  an  inverse  filter. 

A  simple  example  to  illustrate  the  need  of  the  double  criterion  is  as 
follows.  Consider  a  scalar  case  where  the  observed  random  variable  Y  equals 
the  sum  of  the  object  variable  (signal)  X  and  independent  noise  N,  i.e., 

2  2 

Y=X+N  and  the  signal  and  noise  variances,  a  and  cr  are  equal.  Assume  that 

x  n 

the  random  variables  have  zero  mean.  Consider  two  possible  estimates  X  of 

2 

X,  X=Y  and  X=0.  The  mean  squared  error  of  the  first  estimate  is  a  and  that 

n 

2 

of  the  second  is  a Pence  their  estimation  errors  are  equal.  From  a 

perceptual  point  of  view,  it  is  certainly  better  to  see  X=Y  than  X=0,  which 
is  no  information  at  all.  One  quantitive  measure  of  the  difference  between 
the  two  estimates  is  the  correlation  of  their  error  with  the  signal.  For 
X=Y  the  correlation  between  the  error  and  the  signal  X  is  zero,  whereas,  for 

2 

X  =  0,  the  correlation  between  the  error  and  the  signal  is  Hence,  for 

equal  error,  we  choose  the  estimator  which  has  the  smaller  correlation 
between  its  error  and  the  signal. 

The  case  of  signal  plus  uncorrelated  noise  can  be  visualized 
vectorially  as  shown  in  Figure  3.  When  we  select  a  linear  estimator  X=aY, 
the  variation  of  the  constant  a  between  0  and  1  moves  the  estimator  from  the 

2  2 

tail  to  the  tip  of  the  vector  Y  from  a  mean  squared  error  of  to  with 


an  interim  minimum  of  e 

min 


a  a  (a  +a  )  when  the  error  vector  is 
x  n  x  n 
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perpendicular  to  Y.  The  correlation  of  the  error  with  the  signal  moves  from 

2 

o  at  the  tail  to  zero  at  the  tip.  Therefore,  if  one  moves  from  the  minimum 
x 

error  point  toward  the  tip  of  Y,  there  is  a  trade-off  of  increased  error  for 
increased  error-signal  correlation.  In  Figure  4  the  mean-squared  error  d ( a ) 
is  plotted  against  the  error-signal  correlation  c(a)  as  a  varies  from  0  to 
1.  The  plot  shows  that  a  maximal  trade-off  point  of  error  for  correlation 
may  be  realized,  depending  on  definitions  of  allowable  range. 

The  convexity  of  the  c_d  locus  versus  a  was  proved  and  the  optimization 
problem  of  minimizing  c(a)  and  d(a)  simultaneously  over  a  takes  the  form  of 

min  [w^ c ( a )  +  w^ta) ] 
a 

with  weighting  coefficients  w  ^  and  w2.  The  choices  of  w ^  and  w 2  are 

subjective,  but  reasonable  choices  are  the  reciprocals  of  the  ranges  of  the 
corresponding  weighted  variables.  The  ultimate  objective  function  is 


f  (a) 


2  ,  .  x  2..  ,  22 

octal  +ad(a)  ,  o>o 
n  x  x-  n 

2  2  2  2 

octal  +  o  d  ( a )  ,  o<o 

x  n  x  n 


with  the  minimizing  a=a  , 

o 


,  2  2.  2  2  2 

maxis  ,o  J  o+o 

x  n  x  n 


resulting  d  and  c  of 


with  e  .  =  o  a  (a^+a  )  ,  the  minimum  mean  squared  error.  When  the 

min  x  n  x  n 

2  2 

s i g n a  1 - t  o-no  i  s e  ratio  o  /o  is  either  very  large  or  very  small,  the 

x  n 

resulting  mean  squared  error  dla^)  is  close  to  the  minimum.  However,  when 

neither  case  applies,  a  small  additional  error  is  traded  for  a  larger  amount 
of  error-signal  correlation.  This  work  and  all  subsequent  derivations  and 
results  in  image  restoration  with  this  double  criterion  appear  in  the  Ph.D. 
dissertation  of  Woo-J  in  Song  [6]  and  [8],  In  his  dissertation  Song  also 
extended  the  previous  scalar  random  variable  case  to  stationary  continuous 
and  discrete  image  fields.  The  resulting  digital  linear  restoration  filter 
characteristic,  which  we  call  the  minimum  error,  maximum  correlation  (MEMO 


filter  is 


A  ( m ,  n ) 


li  ( m ,  n  )  S'  ( m ,  n  ) 


max  [  I!(m,n)  S  (m,n),  S  ( m ,  r )  ] 
x  n 


H  (m,r)  S  (m,  n) 

1  _ x _ _ _ 

ll(ir,n)  Z  S  (r,p)+S  (m,n) 
x  n 


tr -0 , 1  _ .’1-1 

n=0,l, . . . ,N-1 


where  I7(m,n)  is  the  sampled  point  spread  function  of  the  linear  blur  and 
S  (m»n)  and  S  (m,n)  are  the  discrete  power  spectra  of  the  object  and  noise, 

re  spec  t ive 1 y . 

Blurred  images  in  additive  Gaussian  noise  were  simulated  by  the 
computer  and  restored  through  the  above  filter.  The  point  spread  functions 
of  the  blur  were  Gaussian  with  8x8  support  and  lineal  with  4x4  support.  In 
Figures  5  and  6  are  shown  comparative  results  of  estimation  with  the  double¬ 
criterion  (MEMC)  and  the  Wiener  Filters  using  lineal  blur  and  signal-to- 
noise  ratios  of  13  and  10  dB.  It  is  evident  that  the  MEMC  filter  produces 
the  sharper  image  at  the  expense  of  more  "spotting".  At  the  lower  signal- 
to-noise  ratio,  the  sharper  image  even  with  more  noticeable  nspottingn  is 
preferred  by  most  observers  than  the  fuzziness  in  the  Wiener  filtered  image. 
In  Figure  7  are  the  same  filter  comparisons  with  no  blur  and  0  dB  SNR.  Here 
the  relative  sharpness  of  the  double-criterion  restored  image  is  even  more 
pronounced . 


3.3  ADAPTIVE  WINDOWING  AND  NONLINEAR  FILTERING 


Traditional  and  new  estimators  which  derive  their  spectral  estimates  by 


averaging  over  regions  of  an  image  produce  restorations  of  noisy  images 


which  are  noisy  in  the  flat  areas  due  to  insufficient  smoothing  and  somewhat 


fuzzy  in  the  edge  regions  due  to  attempted  noise  smoothing.  The  problem 


arises  partly  because  the  regions  over  which  the  spectral  parameters  are 
estimated  (the  so-called  analysis  windows)  contain  both  flat  areas  and  edges 
and  are  appropriate  for  an  average  of  both,  but  not  separately  for  either. 
If  the  analysis  window  contains  elements  truly  representative  of  the  current 
picture  element,  then  the  estimate  of  the  true  object  intensity  at  the  point 
would  be  improved.  A  new  scheme  for  adaptive  windowing  was  tried  in 
conjunction  with  subsequent  non-linear  filtering.  We  will  describe  the 
technique  briefly  and  show  some  of  the  results,  which  are  truly  impressive. 

Let  us  assume  for  the  moment  one-dimensional  data  where  the  degradation 

2 

is  additive,  uncorrelated  stationary  noise  of  variance  o  .  Assume  a  window 
of  length  has  been  set  for  the  kth  data  element.  An  activity  index  is 
computed  from  the  sample  variance  in  that  window.  If  <  T^,  a  threshold 
inversely  proportional  to  L,  ,  then  the  next  window  L,  „  =  L,  +2  unless  it 
has  reached  a  preset  maximum.  If  2  T^,  then  the  window  length  is 
decremented  by  two.  The  sample  mean  m^  and  sample  variance  are 

calculated  within  the  window  straddling  the  k**1  image  element  of  intensity 
y^  to  produce  the  signal  estimate  x^  by 

max[0,V^-o^] 

xk  =  “k  +  \T  (Vmk)- 

Although  the  estimate  appears  linear  in  its  form,  it  is  nonlinear  because  of 
the  nonlinear  dependence  on  V^,  which  in  turn  is  a  nonlinear  function  of  the 
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data.  The  mean,  variance,  window  length,  and  error  can  be  computed 
recursively,  thereby  reducing  the  computational  requirements.  The  extension 
of  the  scheme  to  two  dimensions  is  straightforward,  except  that  the  shape  of 
the  analysis  window  needs  specification.  For  simplicity,  square  windows  are 
chosen. 

This  adaptive,  nonlinear  estimator  was  applied  successively  to  a  noisy 
one-dimensional  simulated  waveform,  a  scan  line  of  a  noisy  image,  noisy 
images  using  one-dimensional  windows,  and  images  using  square  windows.  In 
all  cases,  the  edge  or  detail  features  are  preserved  and  the  flat  contrast 
areas  show  almost  imperceptible  noise.  In  fact,  the  image  restorations  are 
remarkably  sharp  and  clean.  One  example  is  presented  here  to  give  evidence 
to  those  assertions.  Shown  in  Figure  8  is  the  original  mans  face  image  and 


the  same  image  with  additive,  white  Gaussian  noise  of  variance  o  =  489.  In 
Figure  9  appear  restorations  by  the  new  adaptive  window,  nonlinear  filter 
and  the  minimum  mean-squared  error  or  Wiener  filter.  Note  the  clarity  and 
sharpness  of  the  new  filter's  restoration  compared  to  that  of  the  Wiener 
filter.  It  is  also  surprising  that  the  mean-squared  error  of  the  Wiener 
filter  is  smaller  (82.22)  than  that  of  the  new  filter  (87.76).  Such  a 
result  is  convincing  evidence  that  mean  squared  error  is  not  a  good  measure 
of  visual  error.  Details  and  results  appear  in  [9]  and  the  Ph.D. 
dissertation  of  W.  J.  Song  [6]. 

The  adaptive  window  algorithm  can  also  be  utilized  in  other  estimation 
paradigms  .  In  this  regard,  we  also  incorporated  it  into  the  two  step 
estimation  procedure  for  blurred,  photon-limited  images  described  earlier. 

The  estimate  b^  of  blurred  intensity  b^  at  the  k**1  image  point  from  the 


number  of  photodetector  counts 


8k 


is 
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m^/A  ,  if  =  0 
2 

mk  +  max  [0,  VR  -  mj  gk 
^Amax  [Vk,  mk] 


,  otherwise 


where  X  is  the  Poisson  rate  parameter,  and  m^  and  are  the  estimates  of 

the  mean  and  variance,  respectively,  of  the  image  at  point  k  taken  in  the 
analysis  window.  The  mean-squared  error,  used  as  the  noise  in  constrained 
deconvolution  of  the  blur  in  the  second  step,  is 


d 


k 


/• 

mk/A2  .  if  Vk  =  0 


mk  max  [0,  Vk  -  mk) 


,  otherwise 


The  derivations  of  these  formulas  and  the  subsequent  results  appear  in 
Song’s  dissertation  [6].  A  sample  of  these  results  appear  in  Figure  10.  In 
Fig.  10(a)  is  a  test  image  degraded  by  4x4  lineal  motion  blur  and  corrupted 
by  Poisson  noise  with  q=0.5.  The  result  of  the  first-step  estimation  of  the 
blurred  image  appears  in  Fig.  10(b).  The  blurred  signal  to  noise  ratio 
(SNR)  increased  by  6.6  dB  from  10.9  to  17.5  dB.  The  second  step  constrained 
deconvolution  of  the  blurred  image  plus  the  estimation  noise  appears  in  Fig. 
10(c)  for  one  value  of  a  constraint  parameter.  There  is  considerable 
improvement  in  visual  quality  from  both  steps  of  the  restoration  procedure. 
It  is  evident  that  adaptive  windowing  produces  superior  reconstructions  than 
the  fixed  window  algorithm  given  in  Section  3.1. 
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Figure  1:  System  for  simulation  and  restoration  of  photon-limited, 


blurred  mages 


Figure  2:  Degraded  and  restored  Man  images 

(A)  Noisy  image,  no  blur  n=-125,  SNR*7.9dB 

(B)  Restoration  of  (A).  SNRr=14.5dB 

(C)  Noisy  image,  4x4  blur  ri  = .  1 25^  SNR^=6 . 5dB 

(D)  Restoration  of  (C),  SNR^=13.8dB 
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Figure  3 
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Vector  representation  of  v=x+n  and  estimator 


x=ay  with  x  and  n  orthogonal 
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(.i) 


(b)  (c) 


5:  Restoration  of  image  degraded  by  4x4  i ineal  motion  blur 

and  additive  white  Gaussian  noise  of  variance  0^=54 
( SNR= 1 7 d B )  :  (a)  blurred,  noisy  image;  (b)  image  restored 

by  Wiener  filter;  (c)  image  restored  by  MEMO  filter. 
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Figrue  6:  Restoration  of  image  degraded  by  4x4  lineal  motion  blur 


and  additive  white  Gaus..ian  noise  of  variance  o  =135 

n 

(SNR=13dB):  (a)  blurred,  noisy  image;  (b)  image  restored 


by  Wiener  filter;  (c)  image  restored  by  MEMC  filter. 


-  J4 - 


AWGN :  SNR  =  0  dB,  no  blur. 


•igure  7:  Simulated  noisy  image  and  restorations  by  Wiener 

and  MEMC  filters.  Top  is  noisy  image,  bottom  left 
is  Wiener,  and  bottom  right  is  MEMC  restoration. 
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Figure  8:  Original  and  noisy  mans  face  image.. 

(a)  Original  "mans  face"  image 

2 

(b)  Lmage  in  additive  noise  of  0  489 

( S/N= 1 OdB) . 
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Figure  9:  Comparison  of  restorations  of  image  in  Figure  8ib). 

(a)  Filtered  "face"  image  by  J-D  adaptive  filtes 

(b)  "Face"  image  restored  by  Wiener  Filter. 


(a)  Photon- l imi ted  image  with  4x4  motion  blur,  (*=0.5) 

(b)  Estimated  image  from  first  step  (before  deblurring) 

(c)  Final  restored  image  alter  second  step. 
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4.0  FAST  HIERARCHICAL  ALGORITHMS  FOR  MOTION  ANALYSIS 

The  results  of  this  task  are  documented  in  detail  in  a  Master's  degree 
thesis  [4.3-1],  and  a  Masters  degree  project  report  [4.3  3].  The  former 
presents  a  comparative  study  of  correlation  measures  used  in  motion  analysis 
and  then  proposes  a  multi-step  approach.  The  latter  report  presents  a 
method  of  motion  representation  based  on  image  segmentation  using  pyramidal 
data  structures  and  shows  how  the  algorithm  work  for  different  image 
sequences.  Excerpts  from  these  reports  follows. 

4.1  CORRELATION  AND  THE  MULT I -RE SOLUTION,  FLOW-THROUGH  ALGORITHM 

Correlation  is  commonly  used  in  motion  analysis  to  discover  the 
correspondence  among  local  image  patterns  in  a  sequence  of  image  frames. 
However  a  variety  of  such  measures  can  be  employed  and  these  may  differ 
widely  in  performance  and  in  computational  cost.  In  this  first  half  of  this 
study  [4.1-1]  five  basic  correlation  measures  are  compared.  Performance  is 
determined  empirically  as  a  function  of  image  content  (spectrum)  and  in  the 
presence  of  various  types  of  image  degradation. 

A  multi-step  approach  to  motion  analysis  is  proposed  in  the  second  half 
of  the  study.  Decomposition  of  the  analysis  into  distinct  steps  and 
independent  channels  means  that  the  computations  may  be  performed  in 
parallel  pipelined  hardware.  Data  is  said  to  "flow-through"  the  system 
because  analysis  is  uniform:  a  fixed  sequence  of  operations  is  applied  to 
all  images  and  to  all  positions  within  each  image,  independent  of  image 


content . 
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4.1.1  PERFORMANCE  OF  LOCAL  CORRELATION  MEASURES 

Correlation  between  local  regions  of  two  or  more  images  may  be  used  to 
detect  pattern  displacements,  and  hence  object  motion.  However  correlation 
can  yield  incorrect  pattern  matches.  Of  the  five  correlation  measures 
examined,  direct  correlation  consistently  gave  the  highest  error  rates. 
Performance  was  particularly  poor  when  the  mean  value  of  a  local  pattern  was 
large  compared  to  its  variance,  or  when  patterns  to  be  matched  differed  in 
mean  or  contrast.  Variance  normalized  correlation  consistently  gave  the 
lowest  error  rates.  No  errors  were  obtained  with  this  measure  in  the 
absence  of  relative  image  distortion.  However,  in  the  presence  of  noise  its 
performance  was  only  slightly  better  than  other  measures.  Since  the  other 
measures  require  considerably  less  computation  than  variance  normalized 
correlation  these  may  be  preferred  under  appropriate  image  conditions. 
Correlation  following  Laplacian  filtering  is  particularly  effective  when  the 
image  contains  high  frequency  components  and  any  noise  is  restricted  to  low 
frequencies.  Even  the  binary  correlation  performs  well  under  these 
conditions.  However  both  measures  are  very  susceptible  to  high  frequency 
noise.  Therefore,  the  Laplacian  correlation  appears  to  be  the  most 
attractive  of  five  measures  studied.  However  this  measure  should  not  be 
used  in  the  presence  of  high  frequency  noise. 

4.1.2  PERFORMANCE  OF  MMF 


We  have  described  a  system  in  which  motion  analysis  is  carried  out  in 
three  distinct  steps.  First,  each  image  in  a  motion  sequence  is  passed 
through  a  set  of  filters  to  obtain  a  corresponding  set  of  band-pass  copies. 


Second,  a  optimum  nine  "motion  channels"  are  formed  within  each  frequency 
band  from  original  twenty-five  channels,  each  "tuned"  to  motion  in  a 
particular  direction  and  velocity.  Finally,  the  outputs  of  these  simple 
channels  are  combined  in  groups  of  three  to  form  four  compound  motion 
channels,  each  tuned  to  a  particular  orientation,  the  outputs  of  the  four 
orientation  channels  are  compared  and  combined  into  a  single  motion  estimate 
for  each  sample  position  within  the  frequency  band.  Each  group  of  three 
channels  has  two  outputs,  a  velocity  estimate  and  a  confidence  measure.  By 
comparing  channel  outputs,  it  is  possible  to  determine  whether  motion  is 
within  textured  image  region,  near  a  prominent  image  contour,  or  near  an 
occluding  edge.  Channel  outputs  are  combined  in  the  manner  appropriate  for 
each  of  these  cases  in  step  3. 

A. 1.3  CONCLUDING  REMARKS 

As  demonstrated  by  the  results,  the  combination  of  Laplacian  pyramid 
structures  together  with  the  MMF  strategy  has  proven  to  be  a  powerful 
technique  for  motion  analysis.  The  power  of  the  technique  developed  in  the 
study  lies  in  its  simplicity,  computational  efficiency  and  its  high 
accuracy. 

A. 2  MULTI-RESOLUTION  SPLIT  AND  LINK  SEGMENTATION  ALGORITHM 

The  need  for  representing  motion  in  digital  images  is  motivated  by 
applications  in  transmission  of  video  and  in  artificial  intelligence.  In 
the  first  type  of  application,  bandwidth  compression  in  the  image  processing 
sense  is  sought;  we  attempt  to  transmit  less  data  to  communicate  the 
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required  information  about  the  image.  In  the  second  type  of  application,  we 
attempt  to  give  the  computer  a  primitive  understanding  of  how  objects  move 
and  an  ability  to  track  this  motion  without  necessarily  recognizing  the 
objects.  A  method  of  motion  representation  based  on  image  segmentation 
which  lends  itself  to  these  applications  is  presented  in  this  work  [4.1-3] 
along  with  results  from  its  software  implementation. 

The  segmentation  scheme  presented  here  gives  excellent  results  for  a 
variety  of  images.  The  segments  in  the  image  are  represented  by  links  and 
segment  values.  The  information  provided  by  these  links  and  segment  values 
is  contained  in  a  pyramid  structure  which  is  suitable  for  analysis.  Our 
motion  algorithm  uses  the  same  representation  for  an  image  as  the 
segmentation  algorithm  and  is  capable  of  updating  the  links  and  segment 
values  given  an  estimate  of  displacement  without  the  necessity  of  obtaining 
an  entirely  new  segmentation.  If  we  are  willing  to  accept  a  processed  image 
with  quality  not  quite  suitable  for  viewing  but  suitable  for  analytic 
purposes,  then  we  have  met  our  goals  for  representing  motion. 

4.3  PRESENTATION  AND  PUBLICATIONS 

1,  C.  Yen,  Local  Correlation  Measures  Study  and  "Multi-Resolution,  Flow- 
Through"  Algorithm  for  Motion  Analysis,  M.  Sc.  Thesis,  Electrical, 
Computer  and  Systems  Engineering  Dept.,  Rensselaer  Polytechnic 
Institute,  Troy,  NY  12180,  August  1983. 

2.  P.  Burt,  "Multi-Resolution  Flow-Through  Motion  Analysis",  Proceedings 
of  Computer  Vision  and  Pattern  Recognition  Conference,  Washington, 


D.C.,  July  1983. 
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3.  S.  Schrier,  A  Multi-Resolution  Split  and  Link  Algorithm  for  Image 
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Electrical,  Computer  and  Systems  Engineering  Dept.,  Rensselaer 
Polytechnic  Institute,  Troy,  NY  12180,  May  1984. 
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5.  TEXTURE  MODELING  AND  DISCRIMINATION 

5.0  INTRODUCTION 

The  focus  of  this  task  has  been  the  development  of  stochastic  models 
for  texture  and  the  related  problems  of  developing  and  evaluating  various 
texture  classification  and  discrimination  algorithms.  Motivation  for  this 
work  has  been  the  following  important  image  processing  problem:  segment  an 
image  field  into  disjoint  regions  which  may  possess  the  same  average  gray 
level  but  differ  in  the  spatial  distribution  of  gray  levels.  These  two 
characteristics  of  an  image  are  generally  referred  to  as  tone  and  texture, 
respectively.  Our  interest  has  been  concerned  with  the  latter.  While 
somewhat  elusive  to  define  precisely,  the  most  generally  accepted  definition 
of  texture  at  present  is  that  it  consists  of  a  basic  local  order  or  quasi- 
homogeneous  pattern  which  is  repeated  in  a  "nearly"  periodic  manner  over 
some  image  region  which  is  large  relative  to  the  size  of  the  local  pattern. 

Our  work  in  this  area  can  be  conveniently  classified  into  three  broad 
categor ie  s : 

1. )  Develop  useful  stochastic  texture  models  and  characterize 

their  properties. 

2. )  Develop  model-based  texture  c 1  a s sif  ica t ion/di scr  ip  inu t ion 

algorithms  on  the  basis  of  statistical  decision  theory 
concepts. 

3. )  Evaluate  and  characterize  the  performance  of  various  texture 

classification/discrimination  algorithms  under  identical 
conditions. 


We  will  discuss  each  of  these  areas  separately. 


5.1  STOCHASTIC  TEXTURE  MODELS 


We  have  considered  several  classes  of  stochastic  texture  models 
including : 

1. )  2-D  autoregressive  moving  average  (ARMA)  processes. 

2. )  2-D  random  tessellation  processes. 

3. )  The  2-D  random  grain  model. 

4. )  2-D  Markov  random  fields  (MRF's). 

In  what  follows  we  will  describe  our  work  in  each  of  these  areas  separately. 


5.1.1  2-D  ARMA  MODELS 

These  models  have  been  used  extensively  in  one  form  or  another  in  a 
variety  of  image  processing  applications  although  a  comprehensive  systematic 
study  of  this  class  has  never  been  provided.  Specifically,  the  2-D  ARMA 
process  is  generated  recursively  according  to 

S.  .=  a,  S .  .  .  +  ”  b.  -W.  .  .  ,  i.j>l,  (5.1) 

1,3  (k.  )eP'  k’  1_k'J  '  (k,  )eV  k'"  1  k’J  ’• 
a  D 

where  {W.  .}  is  a  2-D  i.i.d.  sequence  and  both  V  and  V  are  specified 

i ,  j  a  b 

subsets  of  a  non-symmetr ical  half  plane  Z={k20, k20}  u  {k2l«k2-D*  Assuming 
a  row-by-row  raster  scanning  pattern,  the  sequence  {S^  ^ }  is  then  generated 
*s  8  causa  1  filtering  operation  over  past  outputs  and  present  and  past 
inputs.  A  special  case  of  (  5.1)  is  the  2-D  seoa rab  1  e  Gauss-Markov  model 
with 

Si.j=plSi-l.j+p2Si.j-rplP2Si-l.j-l+Wi,j’  (5-2) 


where  02 1  p  j  1 2*  <  i=1.2  represent  horizontal  and  vertical  pixel-to-p ixe  1  cor¬ 
relation  coefficients,  respectively,  and  (W  }  is  an  i.i.d.  zero-mean 

*  <  J 

+  The  prime,  as  in  V’  in  (2.1)  ,  is  intended  to  denote  exclusion  of  the 
point  (0,0). 
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2  2 

Gaussian  sequence  with  common  variance  w=  s (1-p^)  (1-p^ )  defined  in  terms  of 
the  variance  o-  of  the  assumed  stationary  sequence  {S^  ).  Similarly,  it's 
possible  to  define  a  2nd-order  senarabl e  2-D  Gaussian  process  as 


2  2, 

S,  .=  "  "  a.  S .  .  +W.  .  , 

x*j  k=o  ,=o  1'J 


where  again  {W.  }  is  an  i.i.d.  zero-mean  Gaussian  sequence.  The  cor- 

i.  J 

responding  coefficients  can  then  he  expressed  in  terms  of  pole  locations  in 
the  vertical  and  horizontal  spatial  frequency  planes  as  illustrated  in  Table 
5.1.  More  specifically,  r^  and  i=l,2,  are  the  radial  and  angular  posi¬ 

tions,  respectively,  of  the  separable  discrete  system  transfer  functions, 

H.(z),  generating  {S.  .}.  This  is  illustrated  in  Fig.  5.1. 

i  i,j 

It  can  be  shown  that  the  input  sequence  {W.  in  the  2nd-order  case 

must  possess  variance 


1-rf  1-r 


1+r,  /  \1+r 


(l-2r^cos291+r^)(l-2r2cos2e2'*-r2) . 


For  example,  if  either  H^(z)  or  H^z)  possesses  a  pole  on  the  unit  circle 

then  ~'=0.  We  assume,  of  course,  that  initial  conditions  are  chosen  to 

w 

result  in  stationary  conditions. 

Due  to  the  separability,  the  2-D  autocorrelation  sequence  is  given  by 

R(m,  n)  =  R  (m) R_ (n) ,  (5.5) 

w  1  2  • 

where 

R.(k)=rLrO  B.(z)H.(z-1)zk-1dz;  i=l,2.  (5.6) 

l  2jtj  Tii 

C1 

In  particular,  it's  easily  seen  that  the  correlation  coefficients  in  the 
vertical  and  horizontal  directions,  respectively,  are  now  given  by 
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Table  5.1 


Coefficient  Values  for  2-D  Second-Order  Auto¬ 
regressive  Process. 


-  A  8  - 


|  ;  i*i ,2. 

Note,  in  particular,  that  lp^(k)|_l  for  i=l,2.  However,  unlike  the  first- 
order  case,  where  p.(k)=Pj^^,  the  autocorrelation  is  no  longer 
exponentially  decreasing  in  k  due  to  the sinuso i dal  terms  within  brackets. 
This  property  can  be  useful  in  modeling  a  wider  range  of  textures  than  pos¬ 
sible  with  the  lst-order  model.  Additional  results  on  the  2nd-order  model 
can  be  found  in  [5.1]. 

Our  interest  in  ARMA  models  has  been  in  developing  their  statistical 
properties,  such  as  autocorrelation  functions,  joint  p.d.f's,  etc.,  and  in 
developing  methods  for  fitting  these  models  to  real-world  texture  samples. 
With  regard  to  the  latter  activities  we  have  been  specifically  concerned 
with  developing  m  a  x  i  mum-  1  ike  1  i  hood  (ML)  estimates  of  the  ARMA  model 
parameters,  particularly  when  observed  in  the  presence  of  added  noise. 

5.1.2  RANDOM  TESSELLATION  PROCESS 

This  class  of  random  fields  !ms-  been  studied  extensively  in  the  past 
(cf.  [5.2]  —  [  5.4])  and  this  aspect  of  the  investigation  has  constituted  an 
extension  of  past  efforts.  The  construction  procedure  we  have  considered 
results  in  a  tessellation  of  the  plane  into  convei  regions  by  an  ap¬ 
propriately  defined  field  of  random  lines  which  form  the  boundaries  of  these 
regions.  The  density  of  these  random  lines,  or  edges,  is  defined  in  terms 
of  a  rate  parameter  X.  Gray  levels  are  then  assigned  within  elementary 
regions  to  possess  correlation  coefficient  p  with  gray  levels  in  contiguous 
regions.  For  example,  the  gray  levels  can  be  generated  by  a  2-D  ARMA  model. 
Given  a  particular  partitioning  scheme  then,  the  random  fields  ire  com¬ 
pletely  defined  in  terms  of  the  two  parameters  X  and  p.  The  parameter  X 
represents  the  edge  business  associated  with  an  image  while  p  is  indica¬ 
tive,  at  least  on  an  ensemble  basis,  of  the  edge  contrast  .  For  p  large 
(in  magnitude)  and  negative  there  is  an  abrupt  almost  black-to-white  or 


pi(k) 


r]ki  f s1n[( | k|+1 )e . ]  -  rfs1n[( |k|-l Je,] 


sine . 
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white-to-black  transition  across  an  edge  boundary.  If  p>0,  on  the  other 
hand,  the  transition  across  an  edge  boundary  is  much  more  gradual.  It  is 
relatively  easy  to  define  these  parameters  for  wide  classes  of  imagery  data. 

In  previous  work  [5.2]  —  [5.4]  we  have  been  able  to  evaluate  the  second- 
order  properties  of  this  class  of  random  field  under  the  assumptions  that 
the  line  processes  generating  the  tessellation  are  themselves  generated  by 
either  a  Poisson  or  a  periodic  process.  It  can  be  shown  that  these  two  ex¬ 
amples  are  special  cases  of  renewal  point  processes  with  Gamma  distributed 

interarrival  distribution 
-1 

f(x)  =  *  exp{-x/(5};  x_0,  (5.8) 

me 

with  =1,2,...  and  p=  1  /  X  for  f  ixed  X>0.  For  =1  we  have  f(x)  =  e  ^X,  cor-* 
responding  to  the  ca»c  of  Poisson  partitions,  while  for  .=“,  f (x)=Mx-l/X) 
corresponding  to  periodic  partitions  with  fixed  spacing  =1/X.  Clearly,  the 
quantity  plays  the  role  of  a  homogeneity  parameter.  For  small  the 
mosaic  appearance  is  that  of  a  tiling  of  the  plane  by  random  rectangles.  As 
increases  the  tiling  becomes  more  and  more  regular  or  periodic. 

In  [5.5]  we  have  been  able  to  develop  the  second-order  properties  of 
the  2-D  random  tessellation  process  when  the  tessellation  is  generated  by  a 
pair  of  mutually  independent  renewal  point  processes  with  Gamma  distributed 
interarrival  distribution  as  given  by  (5.8).  This  work  then  represents  a 
considerable  extension  of  our  abilities  to  characterize  the  second-order 
properties  of  the  2-D  random  tessellation  process. 

Other  methods  have  been  studied  as  part  of  this  effort,  including  the 
use  of  the  Voronoi  and  Johnson-Mehl  tessellations  [5.6],  [5.7],  to  provide 
the  underlying  partitioning  of  the  plane.  In  the  Voronoi  tessellation  the 
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centers  of  regions,  called  seeds,  are  distributed  according  to  a  2-D  spatial 
point  process  with  rate  parameter  X.  The  elementary  geometric  regions  then 
consist  of  all  points  closer  to  a  given  seed  than  to  any  other  seed.  In  the 
Johnson-Mehl  tessellation,  the  seeds  are  distributed  according  to  a  space- 
time  point  process  with  spatial  rate  parameter  X  and  temporal  rate  parameter 
v.  The  elementary  regions  then  grow  isotropically  with  spatial  velocity 
In  this  case  some  seeds  may  fail  to  generate  regions  if  their  spatial  loca¬ 
tion  is  overcome  by  regions  associated  with  previously  distributed  seeds. 
While  these  partitioning  schemes  represent  interesting  generalizations  of 
the  2-D  random  tessellation  process,  we  have  met  with  little  success  in 
characterizing  the  resulting  second-order  properties. 

5.1.3  2-D  RANDOM  GRAIN  MODEL 

This  class  of  image  models  have  alternatively  been  called  random  grain 
models.  Boolean  Models,  bombing  models,  etc.  Our  approach  has  been  to  con¬ 
sider  these  models  as  the  output  of  suitably  defined  spatial  filters  excited 
by  a  marked  spatial  point  process  f  5  -  8 1  at  its  input.  For  example,  suppose 
that  the  filter  is  space  invariant  and  can  be  described  in  terms  of  its 
point  spread  function  h(g)  .  Consider  exciting  this  filter  by  a  marked  spa¬ 
tial  point  process  described  in  terms  of  the  triple  {x  ,  a.,  .)  where  x  1 

~iii  “i 

are  the  event  locations,  and  { a  ^ }  and  {  O  are  marks  representing  amplitude 
and  rotation  to  be  assigned  to  the  characteristic  response  of  the  filter  to 
the  spatial  impulse  at  the  corresponding  locations  (x^).  The  field  at  the 
filter  output  is  then  described  by 

f  (g)  =  a . h . ( g-g . ) ,  (.91 

.tit 

i 

where  hj(g)=h(A^g)  with  V  the  unitary  matrix 

.  cos  .:  m 

!  1  ; 

A  = 

=  1 

-sill  r-OS 

i  ;  - 


The  net  result*  are  that  {f(j),  jgR  }  is  given  by  the  superposition  of  *  num¬ 
ber  of  randomly  weighted  *nd  rotated  versions  of  the  original  point  spread 
function  h(j)  replicated  at  random  spatial  positions.  In  this  sense  the 
resulting  random  field  resembles  the  so-called  low-density  shot  processes 
evolving  according  to  a  single  temporal  parameter.  The  description  of  the 
field  in  terms  of  filtering  operations  not  only  provides  a  phenometno  logical 
construction  procedure  for  generating  the  field  but  allows  the  application 
of  existing  and  we  1 1 -de ve 1  ope d  techniques  for  analyzing  the  statistical 
properties  of  1-D  low-density  shot  processes. 

This  class  of  2-D  random  fields  can  be  extended  to  include  nonisometric 

transformations  of  a  basic  point  spread  funct.on  h(x).  Here  the  field 
2 

(f(x),jeR  1  is  obtained  as  a  randomly  weighted  superposition  of  versions  of 
h(j)  which  has  undergone  scaling  in  addition  to  the  rigid  body  motions  of 
translation  and  rotation.  As  an  illustrative  example,  suppose  the  field  is 


described  according  to  (2.9)  and  the  sequel  won  now 


_1  r  cos 8 ^  sine . 

1  L-S  sine.  ;T  coseJ 


with  b^.  ’  appropriately  defined  random  variables  subject  only  to  the  con¬ 

straints  b.  0  and  0.1.  This  transformation  has  an  interpretation  as  a 
i  “  *  r 

rotation  by  radians  followed  by  separate  scaling  of  tbe  orthogonal  car¬ 
tesian  coordinate  axes.  Furthermore,  suppose  that  the  basic  point  spread 


function  h(i)  is  described  by 


n(  i  1  ’  ^ 

h(x)«  < 

_0  ;  ’ : k I  j >1 

2  2  2 

where  llgll  -<l,g>**j+Xj  is  the  ordinary  Euclidean  norm.  The  corresponding 
response  b  (g) =h(A is  then  elliptical  in  shape  as  illustrated  in  Fig. 


Figure  5.2 

Characteristic  Response  of  Random  Grain 
Stochastic  Image  Model 


with  major  and  minor  axes  b.  and  r,.b.,  respectively,  each  rotated  by 

111  1 

radians. 

Other  choices  for  the  basic  point  spread  function  h(x)  are,  of  course, 
possible.  For  example,  consider  the  sinusoidal  response 


h(x) 


sin{ | |x|  |  > 


(  5.13) 


or  even  the  exponential  response 

h(x)  *  exp{- | |xj  |2}  .  ( 5‘ 14) 

A  reasonably  complete  characterization  of  the  2-D  random  grain  model 
has  been  provided  in  {5.9].  This  has  included  evaluation  of  second-order 
properties. 

5.1.4  2-D  MARKOV  RANDOM  FIELDS 

Recent  developments  in  the  theory  of  2-D  MRF's  provide  a  powerful  al¬ 
ternative  to  existing  stochastic  fields  as  a  texture  model.  The  2-D  MRF 
attempts  to  extend  the  familiar  1-D  Markov  property  to  2-D.  More  specifi¬ 
cally,  a  2-D  MRF  is  defined  by  the  property  that  the  conditional  p.d.f.  of 
the  gray  level  at  any  point  in  the  image  plane  given  values  of  all  the  other 
points  depends  only  upon  the  gray  levels  at  pixel  positions  in  a  specified 
neighborhood  of  the  point  in  question. 

Through  its  relationship  with  the  so-called  Gibbs  random  field  (GRF), 
it  can  be  shown  that  the  joint  p.d.f.  of  the  MRF  defined  on  a  specified  lat¬ 
tice  assumes  a  precise  functional  form.  Furthermore,  this  functional  form 
can  generally  be  defined  in  terms  of  a  small  number  of  physically  meaningful 
parameters.  These  parameters  can  then  be  adjusted  so  that  the  resulting 
realizations  of  the  MRF  closely  resemble  real-world  textures. 

A  detailed  treatment  of  the  construction  and  properties  of  MRF's  ap¬ 
pears  in  {5.101.  Here  we  demonstrated  how  the  parameters  of  the  MRF  can  be 
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matched  to  real-world  textures  by  ML  estimation  techniques.  In  this  work  we 
also  developed  ML  techniques  for  texture  classification  and  segmentation  and 
demonstrated  the  efficacy  of  these  techniques  through  experiments  on  real- 
world  texture  samples. 

5.2  OPTIMUM  MODEL-BASED  TEXTURE  CLASSIFICATION/ 

DISCRIMINATION  ALGORITHMS 

Our  efforts  in  this  area  have  been  concerned  with  developing  ML-based 
texture  classification/discrimination  schemes  based  upon  specific  stochastic 
texture  modeling  assumptions.  This  work  was  begun  in  [5.11]  where  use  was 
made  of  the  2-D  random  tessellation  process  described  previously.  The  ap¬ 
proach  was  to  first  reduce  the  data  to  its  spatial  gray-level  co-occurrence 
matrix.  Then  ML  classification/discrimination  schemes  were  developed  using 
as  observations  the  elements  of  the  spatial  gray-level  co-occurrence  matrix. 
Under  some  ergodic  assumptions  on  the  underlying  tessellation  model,  it  was 
possible  to  explicitly  compute  the  c  1  a s s- c ond i t ona  1  1  og- 1  ike  1 i hood 
functionals.  This  approach  has  been  shown  to  be  quite  effective  when  ap¬ 
plied  to  real-world  imagery. 

In  [  5.12]  this  initial  aj  proach  was  extended  to  arbitrary  stochastic 
texture  models  through  an  adaptive  technique  for  estimating  the  unknown 
parameters  appearing  in  the  class-conditional  log-likelihood  functional. 
This  approach  was  shown  to  provide  excellent  classification  performance  on 
real-world  texture  samples.  The  difficulty  was  that  it  required  training 
data  consisting  of  homogeneous  observations  from  each  of  the  possibly  un¬ 
known  texture  classes.  The  approach  then,  while  useful  lor  texture 
classification,  was  not  useful  for  texture  segmentation  since  homogeneous 
training  data  is  generally  not  available. 
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As  part  of  the  present  effort  we  developed  a  technique  for  estimating 
the  structural  parameters  of  the  underlying  stochastic  image  model  directly 
from  the  data  itself.  These  estimates  are  then  used  in  the  technique  of 
[5.111.  with  some  additional  simplifications,  to  provide  a  fully  adaptive 
model-based  approach  to  texture  segmentation.  The  results  are  described  in 
[5.131 . 

A  second  major  contribution  of  this  effort  has  been  to  develop  ML-based 
texture  classification/discrimination  techniques  using  a  MRF  model  of 
texture.  This  approach,  as  reported  in  [5.10],  is  shown  to  be  quite  effec¬ 
tive  in  the  classification/discrimination  of  real-world  textures  at  a 
reasonably  small  computational  expense.  It's  felt  that  this  approach  offers 
an  attractive  performance/complexity  alternative  to  the  schemes  described  in 
[5.111  —  [5. 13] . 

5.3  EVALUATION  AND  CHARACTERIZATION  OF  TEXTURE  CLASSIFICATION/ 

DISCRIMINATION  ALGORITHMS 

In  this  work  we  have  attempted  to  explore  the  performance  of  various 
existing  texture  discriminants  operating  on  identical  and  carefully  control¬ 
led  stochastic  texture  models.  More  specifically,  we  have  concentrated  upon 
the  texture  features  described  by  Haralick  [5.14].  These  features,  which 
are  extracted  from  the  spatial  gray-level  co-occurrence  matrix,  were 
originally  proposed  on  an  ad  hoc  basis.  Unfortunately,  there  is  little  in¬ 
formation  available  on  how  to  choose  critical  parameters,  such  as  the 
separation  distance,  d.  Furthermore,  there  has  never  been  any  comprehensive 
work  performed  to  establish  the  conditions  (e.g.,  the  class  of  underlying 
stochastic  texture  models)  under  which  these  features  provide  effective  dis¬ 
crimination  and  conditions  under  which  they  can  be  expected  to  fail. 
Finally,  there  has  never  been  any  work  directed  toward  establishing  the 


relative  performance  of  texture  classification/discrimination  procedures 
based  on  the  Haralick  features  compared  to  optimum  model-based  algorithms, 
sncb  as  developed  separately  under  this  effort. 

In  [5.15]  we  describe  the  results  of  a  comprehensive  study  of  the  rela¬ 
tive  performance  of  the  Haralick  features  for  a  variety  of  stochastic 
texture  modeling  assumptions.  This  work  has  included  ARMA  models,  as  well 
as  2-D  random  tessellation  processes.  The  results  are  quite  useful  in 
providing  information  on  how  to  choose  key  parameters  for  the  Haralick  fea¬ 
tures  and  in  identifying  conditions  under  which  the  Haralick  features  either 
provide  effective  discrimination  or  fail  to  do  so. 
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6. 


EFFICIENT  ALGORITHMS  FOR  PHOTO-INTERPRETATION 


The  results  of  this  task  are  reported  in  two  documents.  Removal  of 
Cloud  Shadows  from  Aerial  Photographs  by  S-P.  Shu  [6.3-1]  and  ISES  -  An 
Image  Segmentation  Expert  System  for  Aerial  Photographs  Based  on  the  Use  of 
Edge  and  Texture  Features  by  S-P.  She  [6.3-2].  We  shall  present  summaries 
of  the  work  from  the  documents. 


t 
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< 
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6.1  CLOUD  COVER  REMOVAL  AND  COMPENSATION 

i 

i 

1 

The  objective  of  this  project,  the  results  of  which  are  reported  in  I 

[6.3-1],  is  to  develop  an  algorithm  compensating  for  extreme  variation  in  j 

scene  illumination  caused  by  cloud  shadows  in  order  to  obtain  a  uniformly 

bright  image  display.  A  sample  aerial  photograph  was  provided  for  this  1 

purpose  as  well  as  two  reference  points,  one  underexposed  due  to  cloud 
shadows  and  one  overexposed. 

An  elegant  algorithm  has  been  developed  successfully.  The  algorithm 
can  determine  cloud  shadow  edges  under  various  criteria,  divide  the  image  ( 

into  partial  images  along  the  cloud  shadows  edges,  compensate  for  the  | 

l 

brightness  of  each  partial  image,  reconnect  the  compensated  partial  images 
into  one  high-quality  image  without  visible  connection  edges  and  add 

heuristics  to  save  computation  time  and  improve  the  quality  of  the  resultant  i 

image.  The  Cloud  Shadow  Modeling  technique  is  used  to  model  cloud  shadows  I 

in  aerial  images  by  mathematical  expressions  and  to  add  artificial  cloud 
shadows  into  aerial  images. 


j 
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CLOUD  SHADOW  REMOVAL  SCHEME 


The  Cloud  Shadow  Removal  Scheme  is  illustrated  by  the  following  steps; 

(1)  Get  Input  Sample  Image  I^(i,j) 

An  input  sample  image,  such  as  Fig.  1,  is  obtained  by  digitizing 
the  film  of  the  sample  aerial  photograph  using  the  Prime  Computer 
System  in  the  RPI  Image  Processing  Lab. 

(2)  Determined  Cloud  Shadow  Edges 

A  fuzzy-edge  detection  scheme,  developed  by  the  author,  is  used  to 
determine  the  tight  and  loose  bounds  of  cloud  shadow  edges  in  the  image 
I.  (i.j). 

(3)  Divide  Sample  Image  Along  Cloud  Shadow  Edges 

The  image  I^(i,j)  is  divided  into  left  and  right  part  images, 

denoted  by  I  (i.j)  and  I  (i.j)  respectively.  The  overlay  image  from 
L  R 

I  (i.j) and  T  (i.j)  is  denoted  Iw (i.j). 

L  K  M 

(A)  Compensate  Image  Brightness 

The  images  I  (i.j),  and  I  (i.j)  are  compensated  by  algebraic  gray- 

L  n 

level  transformation,  histogram  equalization,  and  mean  &  variance 
transformation  techniques.  The  compensated  images  are  denoted  I^^(i.j) 

IrI (i.j )  and  respectively. 

(5)  Recombine  Partial  Image 

The  compensated  images  I  . ,  ID1,  and  I  .  are  recombined  into  one 

L 1  K1  Ml 

image  I^(i,j)  as  in  Fig.  3  by  the  overlay  image  connection  technique. 

This  technique  was  developed  by  the  author  for  recombining  partial 
images  into  one  image  without  visible  connection  edges. 


The  algorithm  for  the  removal  of  cloud  shadows  from  aerial  photograph 
has  been  successfully  implemented  on  the  Prime  Computer  System  in  the  RPI 
Image  Processing  Lab. 

The  provided  sample  image  is  shown  in  Fig.  1.  By  this  algorithm,  the 
resultant  output  image  is  obtained  as  shown  in  Figs.  2  and  3,  in  which  the 
right  part  cloud  shadow  was  removed  and  the  entire  image  was  enhanced  to 
higher  quality. 
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Fig.  i 

Original  Sample  Image  I^_| 
Resolution:  512  x  512  pixels 
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Resolution:  512  x  512  pixels 
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to  try  to  develop  an  image-segmentation  system  that  would  have  the  same 
advantages  but  without  the  disadvantages.  The  ISF.S  presented  here  comes 
close  to  achieving  the  goal,  having  achieved  excellent  experimental 
results . 

In  the  ISES,  the  A*  algorithm  (AI  minimum-cost-graph  search)  war. 
applied  for  both  heuristic  edge  extraction  (HEF.)  as  well  as  for  region 
formation  ( .- F  )  .  The  A*  algorithm  is  a  powerful  scheme  in  AT  for  searching, 
desirable  paths  under  a  well-defined  cost  function.  The  cost  function  in  FF 
was  defined  with  both  edge-features  and  texture-feature  paiamcU  rs,  thus 
permitting  the  RF  procedure  to  use  edge  and  texture  features  simultaneously 
for  optimal  path  search.  This  approach  d  em  on  s  t  r  a  *  *•<!  the  power  <  t  A' 
approach  in  simultaneously  handling  va  r  i  >us  image  h<*u  r  i  st  i  s  rut  h  as  edg* 
and  texture  features.  Rased  on  the  simultaneous  use  <.f  edge  «•»:<!  texti.j-* 
features  under  a  rule-based  expert-system  cent  r  <-l  ,  the  FF  j  r  eser.t  e-1  t  he 
first  solution  for  eliminating  edge  gaps  and  mil  t  <  r  i  •  :  h  y  edge;;.  Then*  edg. 
gap  and  micro-edge  problems  had  not  been  solved  hef.  r  «  h\  atv  F  '  r;  I  !  •  *.  g . 

detection  techniques. 

The  localized  brightness  compensation  (LR<‘ '  it.  t  he  IFF.1'  :  t  .  *  r 

good  preprocessing  scheme  for  the  IFF?,  hut  is  a  s  ■  a  use}  i. 1  gen.  •  .1  .  r  . 
enhanc  emen  t  scheme  .  This  generalized  I.BC  f  «  r  ■  ma  g  e  <■  r  h  a  to  it  <•  t  •  .  '  i  . 

first  localized  i  mage  -  en  banc  emen  t  scheme.  !  t  is  a  1  t  h«-  f  :  t  :•  *  .  r  age 

enhancement  algorithm  using  rule-based  expet t  sv;  t em:  .q  [  t  t  si. 

The  H  EE  in  the  ISES  is  a  new  useful  edge /bout-da  r  y  d«  t  •*<  t  r « >t  !  cm  !  >  t 
obtaining  one-p  ixel  -wide  edges  or  boundaries  f  r  on;  real  images.  t".>i:q  a  *  <  d  t  . 
texture  classification,  the  HEF.  provided  mix  h  mire  spatially  ai.urat.  •  eg  r 
boundaries.  The  one-pixel  -wide  edges/boundar  i**s  can  he  easily  eta  >  del  -eti 
chain  codes  for  various  image  processing  applications.  It  ;  s  ,  lent  that  a 
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one-p ixel-wide  edge/boundary  is  more  useful  than  a  non- one  -  p  ixel -w  1  d  e  edge 
boundary.  The  edge  thinning  operation  in  the  HEE  is  a  new  efficient 
algorithm  for  obtaining  one-p ixel -w ide  edges  or  boundaiies. 

The  knowledge-based  controller  in  the  ISES  is  a  rule  based  exj-.-tt 

I 

system  model.  It  controls  all  the  processing  of  the  1  SET ,  f  r  .  m  cat  1  v 
!  manipulating  t  he  image  processing  tools  M.BC,  HEE.  and  PE  ,  thtiugt. 

|  modifying  the  data  set  in  the  1  mage  tact  base.  **  obtaining  a  E.gi  ,’ial  .  t  •. 
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